
Conversation

@dmitripikus
Contributor

When using the completions API, for some prompt contents, some models return a response with empty text and a single generated token, i.e. they do not generate a response as expected.
Example of the problem:
Request:

curl -X POST http://localhost:8080/v1/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "meta-llama/Llama-3.1-70B-Instruct",
    "prompt": "USER: Hi, here is some system prompt: hi .Here are some other context:  hi.Here is question #1: can?ASSISTANT: Hi",
    "max_tokens": 20,
    "stream": true,
    "stream_options": {
      "include_usage": true
    },
    "temperature": 1
  }'

Model: meta-llama/Llama-3.1-70B-Instruct

To make the model continue text generation, a "\nASSISTANT:" suffix needs to be appended to the prompt, as sketched below.
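
A minimal sketch of the workaround in Python, assuming a plain requests call against the same local endpoint; the ensure_assistant_suffix helper is illustrative only and not part of this change (streaming is left off here for brevity):

import requests

def ensure_assistant_suffix(prompt: str) -> str:
    # Append "\nASSISTANT:" when the prompt does not already end with it,
    # so the model continues the assistant turn instead of stopping early.
    suffix = "\nASSISTANT:"
    return prompt if prompt.endswith(suffix) else prompt + suffix

payload = {
    "model": "meta-llama/Llama-3.1-70B-Instruct",
    "prompt": ensure_assistant_suffix(
        "USER: Hi, here is some system prompt: hi .Here are some other context:  hi.Here is question #1: can?ASSISTANT: Hi"
    ),
    "max_tokens": 20,
    "temperature": 1,
}

resp = requests.post("http://localhost:8080/v1/completions", json=payload)
print(resp.json()["choices"][0]["text"])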

@Siddhant-Ray
Collaborator

@Shaoting-Feng PTAL too

Collaborator

@Shaoting-Feng left a comment


LGTM

@Shaoting-Feng merged commit e140662 into LMCache:main on Aug 8, 2025
1 check failed

3 participants